DOCUMENT RESUME

ED 283 842                                                  TM 870 355

AUTHOR        Mills, Craig N.; Melican, Gerald J.
TITLE         A Preliminary Investigation of Three Compromise
              Methods for Establishing Cut-Off Scores.
INSTITUTION   Educational Testing Service, Princeton, N.J.
REPORT NO     ETS-RR-87-14
PUB DATE      Mar 87
NOTE          28p.
PUB TYPE      Reports - Research/Technical (143)
EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   *Academic Standards; Certification; Comparative
              Analysis; *Cutting Scores; *Difficulty Level;
              *Judges; Knowledge Level; Licensing Examinations
              (Professions); *Mastery Tests; Statistical
              Distributions; Statistical Studies; *Test Items
IDENTIFIERS   Angoff Methods; *Compromise Model (Hofstee)



ABSTRACT 

The study compares three methods for establishing 
cut-off scores that effect a compromise between absolute cut-offs 
based on item difficulty and relative cutoffs based on expected 
passing rates. Each method coordinates these two types of information 
differently. The Beuk method obtains judges' estimates of an absolute 
cut-off and an expected passing rate, and constructs a cutting line 
whose slope is the ratio of the absolute and relative standard 
deviations and which passes through the point of mean 
absolute/relative cut-off. The judges can be either test-oriented or 
examinee-oriented depending on whether they show greater agreement 
(small standard deviations) on the absolute or relative cut-offs. The 
Hofstee method draws a cutting line through two extreme points: (1) 
maximum cut-off, minimum failure point; and (2) minimum cut-off, 
maximum failure point. The DeGruijter method is similar to the Beuk 
method, but uses confidence estimates for the absolute and relative 
cut-offs to define a criterion ellipse. These methods were applied to 
two tests from a certification program. Judges rated item difficulty 
by the Angoff method and estimated a desirable passing rate. All 
three compromise methods brought the cut-off two points below the 
absolute level, in line with an acceptable passing rate. This study 
suggests that further research into all three of the compromise 
methods is needed. (LPG) 



***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made      *
*                   from the original document.                       *
***********************************************************************






A PRELIMINARY INVESTIGATION OF THREE 
COMPROMISE METHODS FOR ESTABLISHING 
CUT-OFF SCORES 



Craig N. Mills 
Gerald J. Melican 







Educational Testing Service 
Princeton, New Jersey 
March 1987 






Copyright © 1987, Educational Testing Service. All rights reserved. 



January 26, 1987

A Preliminary Investigation of Three Compromise Methods
for Establishing Cut-off Scores

Craig N. Mills
and
Gerald J. Melican

The determination of passing points for tests used to make dichotomous
decisions about individuals has been an area of concern for some time. Many
methods have been proposed and used to establish cut-off scores. These
methods have been grouped into two major classes: those that include judgments
about examinees and those that require judgments about items (Livingston &
Zieky, 1982). Most cut-off score studies in the area of licensure and
certification have fallen into the latter category; that is, methods that
require judgments about items have been predominant. The most widely used
methods have been the Nedelsky, Angoff, and Ebel methods.

One problem associated with the use of item judgment methods has been the
low to moderate relationship of raters' perceptions of item difficulty to the
actual item difficulty. Although different training procedures have been used
with varying degrees of success (e.g., Bejar, 1983; Thorndike, 1982; Lorge and
Kruglov, 1953), the accuracy of judges' estimates of item difficulty is
generally regarded as less than adequate. As a result, cut-off score studies
are typically viewed as tentative and the cut-off scores obtained from
standard setting studies are routinely adjusted. Traditionally, cut-off
scores have been adjusted by lowering (or raising) them by a multiple of the
standard error of measurement or the conditional standard error of measurement
(Lord, 1984) at the cut-off point. The arguments to make adjustments
using the standard error of measurement are usually made in terms of the type
of error least objectionable to the sponsor of the test. For example, if the
sponsor feels that it is not as harmful to pass an examinee whose ability is
actually lower than the cut-off as to fail an able examinee who scored poorly,
the cut-off may be lowered.
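
As a hedged illustration of this traditional adjustment, the sketch below
lowers a cut-off by a multiple of the standard error of measurement. It
assumes the classical formula SEM = SD * sqrt(1 - reliability), which the
text does not state; the function names and the choice of one SEM are
illustrative only. The standard deviations and KR-20 values are those
reported for the two tests in Table 1, and 22 is the Angoff cut-off
reported in the Results.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical SEM = SD * sqrt(1 - reliability) (assumed formula)."""
    return sd * math.sqrt(1.0 - reliability)


def adjust_cutoff(cutoff, sem, multiple=-1.0):
    """Shift a cut-off by `multiple` SEMs; a negative multiple lowers it."""
    return cutoff + multiple * sem


# Standard deviation and KR-20 for each test, as listed in Table 1.
sem_test1 = standard_error_of_measurement(4.20, 0.62)
sem_test2 = standard_error_of_measurement(4.11, 0.60)

# Hypothetical policy choice: lower the Angoff cut-off of 22 by one SEM.
lowered = adjust_cutoff(22, sem_test1, multiple=-1.0)

print(round(sem_test1, 2), round(sem_test2, 2), round(lowered, 2))
```

The multiple (and its sign) is a policy decision that reflects which
classification error the sponsor finds less objectionable; the code does
not choose it.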

A decision to adjust a cut-off based on an investigation of the relative
costs of false positive and false negative errors may be appropriate; however,
the passing rates associated with the alternative cut-off scores are often
inappropriately considered as part of this evaluation. The passing rates are
not directly related to the numbers of errors of either type (false positive
or false negative) possible for the various alternative cuts and, as such, the
passing rates are often reviewed in an inappropriate manner, a manner more
consistent with the setting of the cut-off score to pass a fixed percentage of
examinees. This is not to say, however, that the client's or judges'
estimates of the percent of examinees in the entry population that should pass
are not important information. To the contrary, these estimates are important
collateral information because they are generally based on solid observation
of the examinee population. The manner in which this information is to be
incorporated into the decision regarding the placement of the cut-off is an
issue to be considered.

Three methods of adjusting absolute cut-off scores that incorporate
expected passing percentages have recently been proposed in two contributions
to the literature (Beuk, 1984; DeGruijter, 1985). These methods purport to
provide compromises between an absolute cut-off score (a cut-off based solely
on the examinee's performance on the test) and a relative one (a cut-off based
solely upon the examinee's ranking in some group). These compromise methods,
if they yield acceptable cut-offs, may offer standardized ways of adjusting
cut-off scores. This paper is written to explain the methods and to present
preliminary data on differences in the application of them.

Compromise Methods

In this paper, three compromise methods will be explained and
illustrated. The word "compromise" is used to mean methods that incorporate
both judges' estimates of an absolute cut-off and their estimates of the pass
rates in establishing a cut-off score.

The Beuk Method

Beuk (1984) presented a method in which the cut-off is adjusted as a
function of the degree to which the judge group is identified as "test
oriented" or "examinee oriented." In order to use the Beuk method, two pieces
of information are required from each judge: a cut-off score and an estimate
of the percent of the candidate group that should pass. The score point
associated with the estimate of percent pass is referred to as the relative
cut-off score, i.e., the cut-off score that would result in the pass rate
predicted by the judges. Additionally, a distribution of scores is required.

In order to determine whether a group of judges is test or examinee
oriented, one compares the standard deviations of their item ratings and their
percent pass ratings. If a smaller standard deviation is noted for the
absolute cut-off score data than for the expected percent passing data, the
group of judges is considered to be test oriented. If the reverse is true,
the group is called examinee oriented. The ratio of the standard deviations
of the ratings on the absolute and relative dimensions is then used to adjust
the cut-off as described below:

-a cumulative distribution of the percent of the examinee group
falling below each possible cut-off score (each test score is a
possible cut-off) is plotted,

-a point corresponding to the mean cut-off score and the mean
expected pass rate is plotted on the graph,

-a line with a slope equal to the ratio of the standard deviations
(i.e. the standard deviation of the judges' expected pass rates
divided by the standard deviation of their cut-off scores) is
drawn through the point representing the two means.




The point where the line passes through the cumulative distribution of
scores is the adjusted cut-off. In this report, the Beuk method is
demonstrated in terms of a cumulative distribution of percent fail to maintain
continuity with DeGruijter (1985), who describes all three methods in terms of
percent fail.

The cut-off score obtained via the Beuk method is adjusted in
relation to the two suggested cut-off scores (the absolute and relative cuts).
If the standard deviation for the absolute cut-off is the smaller of the two
standard deviations, the adjusted cut-off will be closer to the absolute
cut-off score than to the expected percent passing cut-off score. In this
way, the Beuk method favors the cut-off score technique for which the judges
show the stronger agreement (smaller standard deviation), and the adjusted
cut-off score will tend toward one or the other of the cut-off scores as the
ratio of the standard deviations departs from unity.
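
The Beuk steps can be sketched in a few lines of Python. This is a minimal
illustration rather than the authors' program: the function name and
argument layout are hypothetical, the line is expressed in percent fail
(hence the negative slope), and the inputs are the Test 1 values reported
in Tables 2 and 3.

```python
def beuk_cutoff(pct_below, n_items, mean_cut_pct, mean_fail_pct,
                sd_cut_pct, sd_fail_pct):
    """Return (raw score, line value, difference) nearest the ogive.

    pct_below maps each raw score to the observed percent of examinees
    scoring below it (the cumulative percent-fail distribution).
    """
    # Slope is the ratio of the standard deviations; negative because
    # the vertical axis is percent fail rather than percent pass.
    slope = -sd_fail_pct / sd_cut_pct
    best = None
    for raw, below in sorted(pct_below.items()):
        x = 100.0 * raw / n_items               # raw score as percent correct
        y = mean_fail_pct + slope * (x - mean_cut_pct)
        diff = y - below                        # line value minus observed
        if best is None or abs(diff) < abs(best[2]):
            best = (raw, y, diff)
    return best


# Test 1: percent below each raw score (Table 3) and judge summaries (Table 2).
test1_below = {17: 21, 18: 28, 19: 35, 20: 45, 21: 57, 22: 67, 23: 74}
beuk_raw, beuk_y, beuk_diff = beuk_cutoff(
    test1_below, n_items=34, mean_cut_pct=63.87, mean_fail_pct=33.0,
    sd_cut_pct=6.13, sd_fail_pct=16.71)

print(beuk_raw, round(beuk_y), round(beuk_diff))
```

Because the score scale is an integer scale, the sketch picks the raw
score at which the line and the ogive are closest instead of solving for
an exact intersection.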

The Hofstee Method

Hofstee's method may be used as a method for setting a cut-off score in
addition to its use as a method for adjusting one. Four judgments are
required from each judge using Hofstee's method:






-an estimate of the maximum acceptable value of the cut-off
($C_{max}$): the cut-off should not be higher than this point even if
every examinee passes the test,

-an estimate of the minimum acceptable value of the cut-off
($C_{min}$): the cut-off should not be lower than this point even if
no examinee passes,

-an estimate of the minimum acceptable failure rate ($F_{min}$), and

-an estimate of the maximum acceptable failure rate ($F_{max}$).

Two points are plotted: the maximum cut-off, minimum failure point and
the minimum cut-off, maximum failure point. The line-segment that connects
the points establishes acceptable combinations of cut-off scores and failure
rates. A cumulative distribution of percent failing is plotted on the same
graph. The point where the cumulative distribution intersects the
line-segment becomes the cut-off.






The Hofstee method evaluates "worst case" possibilities: "Based on the
information provided in the responses to the cut-off score and failure rate
questions, we would be willing to accept a cut-off score as high as $C_{max}$
provided the failure rate did not exceed $F_{max}$. Further, we would accept a
cut-off score as low as $C_{min}$ provided the failure rate was at least
$F_{min}$." The points on the line-segment connecting the extremes
($C_{max}$, $F_{min}$) and ($C_{min}$, $F_{max}$) represent the acceptable
alternative combinations of cut-off scores and passing rates. The point on
this line-segment that coincides with the ogive is the point where the judges
are in agreement with the observed data.

The possibility exists that the line-segment established via the Hofstee
method may not cross the ogive, indicating that the judges' estimates of the
range of possible cut-off scores or the range of possible passing rates (or
both) were inconsistent with the performance of the examinees. This is most
likely when the judges are in strong agreement about one or both of the
extreme cut-off scores and the range between the extremes is small. Decisions
need to be made about how to proceed if this disagreement between the judges'
estimates and observed data should occur. It should also be noted that the
line-segment drawn using the Hofstee method need not intersect the point
representing the mean cut-off score and mean percent pass obtained using the
Beuk method. The Beuk method uses means and standard deviations, while the
Hofstee method uses only extreme values.
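
A minimal Python sketch of the Hofstee procedure follows, using the proxy
extremes described later in this report (highest and lowest Angoff
cut-offs, largest and smallest expected failure rates from Table 2). The
function name is hypothetical, and two choices are assumptions of this
sketch rather than part of Hofstee's method: only integer raw scores lying
on the line-segment are examined, and the segment is judged to miss the
ogive when the differences at those scores never change sign.

```python
def hofstee_cutoff(pct_below, n_items, c_min, c_max, f_min, f_max):
    """Return the raw score where the Hofstee segment meets the ogive.

    Returns None when the segment does not cross the ogive at the
    integer scores it spans (a simplifying assumption of this sketch).
    """
    x_min = 100.0 * c_min / n_items     # minimum cut-off, percent correct
    x_max = 100.0 * c_max / n_items     # maximum cut-off, percent correct
    # Segment runs from (x_min, f_max) down to (x_max, f_min).
    slope = (f_min - f_max) / (x_max - x_min)
    diffs = {}
    for raw, below in sorted(pct_below.items()):
        x = 100.0 * raw / n_items
        if x_min <= x <= x_max:
            diffs[raw] = f_max + slope * (x - x_min) - below
    if not diffs or min(diffs.values()) > 0 or max(diffs.values()) < 0:
        return None
    return min(diffs, key=lambda raw: abs(diffs[raw]))


# Percent below each raw score, from Tables 3 and 4.
test1_below = {18: 28, 19: 35, 20: 45, 21: 57, 22: 67, 23: 74, 24: 81}
test2_below = {19: 21, 20: 29, 21: 40, 22: 50, 23: 61, 24: 72}

hofstee_test1 = hofstee_cutoff(test1_below, 34, 18.20, 24.40, 15, 67)
hofstee_test2 = hofstee_cutoff(test2_below, 34, 20.85, 22.95, 25, 30)
print(hofstee_test1, hofstee_test2)
```

Returning None leaves the decision about how to proceed to the user, in
keeping with the caution that such a decision has to be made whenever the
judges' ranges are inconsistent with the observed data.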

The DeGruijter Method

The DeGruijter method is similar to the Beuk method. Each judge provides a
cut-off score and an expected pass rate. Additionally, however, judges must
provide estimates of their confidence in their ratings on both the absolute
and relative dimensions (or the ratio of their confidence in those ratings).




The DeGruijter method identifies the one member of a family of ellipses that
just touches the ogive. The family of ellipses is defined by the equation:

$$ d^2 = r^2 (c_o - c_i)^2 + (f_o - f_i)^2 \qquad (1) $$

where $d$ is half of the length of the ellipse in the vertical
          direction,

      $r$ is the ratio of the judges' uncertainty with respect to
          the true value of $f$ to their uncertainty about the true
          value of $c$ ($u_f / u_c$),

      $c_o$ is an observed cut-off (a test score),

      $c_i$ is the ideal cut-off (from the cut-off score study),

      $f_o$ is the observed failure rate at $c_o$, and

      $f_i$ is the ideal failure rate (from the cut-off score
          study).

The values of $c_o$ and $f_o$ that yield the smallest value of $d$ define the
one ellipse in the family of ellipses from Formula 1 that just touches the
ogive. $c_o$ is then taken as the adjusted cut-off that provides the best
compromise




between the absolute and relative cut-off scores. The point ($c_o$, $f_o$)
does not have to be a whole number, but the equation may be solved using whole
numbers.

The ratio of the uncertainty estimates determines the amount of
compromise required along the cut-off score and failing percent continua. A
ratio greater than 1.0 (the judges were more uncertain about the estimates of
the failing rates than they were about the estimates of the cut-off scores)
will result in an ellipse in which the vertical axis is the major (longer)
axis. This type of ellipse will tend to result in a larger discrepancy
between the adjusted failure rate and the judges' estimated failure rate than
between the adjusted cut-off score and the judges' estimated cut-off score. A
ratio less than 1.0 will have the opposite effect; the horizontal axis will
be the major axis.
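
The search for the smallest ellipse can be sketched as follows. The code
assumes Formula 1 has the form d^2 = r^2 (c_o - c_i)^2 + (f_o - f_i)^2,
with the subscript i marking the judges' ideal values; that reading and
the function names are assumptions of this sketch. The inputs are the
Test 1 values from Tables 2 and 3, with the ratio of the standard
deviations standing in for the uncertainty ratio, as is done in this study.

```python
import math


def degruijter_d(c_obs, f_obs, c_ideal, f_ideal, r):
    """Vertical half-length d of the ellipse through (c_obs, f_obs)."""
    return math.sqrt(r ** 2 * (c_obs - c_ideal) ** 2
                     + (f_obs - f_ideal) ** 2)


def degruijter_cutoff(pct_below, n_items, c_ideal, f_ideal, r):
    """Return (raw score with the smallest d, dict of d for every score)."""
    d_by_score = {
        raw: degruijter_d(100.0 * raw / n_items, below, c_ideal, f_ideal, r)
        for raw, below in pct_below.items()
    }
    best = min(d_by_score, key=d_by_score.get)
    return best, d_by_score


# Test 1: percent below each raw score (Table 3) and judge means (Table 2).
test1_below = {18: 28, 19: 35, 20: 45, 21: 57, 22: 67}
ratio = 16.71 / 6.13          # proxy for the uncertainty ratio u_f / u_c
dg_raw, dg_values = degruijter_cutoff(
    test1_below, n_items=34, c_ideal=63.87, f_ideal=33.0, r=ratio)

print(dg_raw, {raw: round(d) for raw, d in sorted(dg_values.items())})
```

Evaluating d only at whole-number scores mirrors the remark that the
equation may be solved using whole numbers; a continuous ogive would
require interpolation, which this sketch does not attempt.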

Procedures and Instruments

Two tests were included in the study. Both tests are included in the
same certification testing program, but cover different content areas.
Separate panels of judges were convened to rate the two tests. The Angoff
method was used to obtain ratings of item difficulty for the items in each
test. Prior to the rating of test items, each judge responded to the
following question: If the test were a perfect instrument for measuring
exactly what candidates knew, what percentage of candidates taking the test
should pass? Each test contained 34 four-option multiple-choice items.
Descriptive information about the tests is shown in Table 1.

The data collected were directly applicable to the Beuk method, but the
collection of the additional data required for the DeGruijter and Hofstee
methods was not included in the study design. To demonstrate the DeGruijter
method, the ratio of the standard deviations of the absolute and relative
cut-offs was used as a proxy for the ratio of the uncertainty estimates. The
outcome of this decision was that the DeGruijter results are very similar to
the Beuk results. To demonstrate the Hofstee method, the highest cut-off
resulting from the Angoff data collection was used as the maximum acceptable
cut-off, the lowest Angoff cut-off was taken as the minimum acceptable cut-off,
and the maximum and minimum failure rates were set from the values obtained
from the judges' responses to the expected percent pass question.



Results

The Angoff cut-off scores and the expected pass rates are shown in Table
2. The group rating Test 1 can be described as "test oriented." The standard
deviation of the absolute cut-off scores is approximately one-third as large
as the standard deviation of the expected pass rates (6.13 and 16.71
respectively). The standard deviations for the group rating the second test
were much more similar (2.34 for the absolute cut-off and 1.91 for the passing
rates).
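
The summary statistics quoted here can be reproduced from the individual
judges' values in Table 2. The sketch below is illustrative only: the
function name is hypothetical, and it assumes that the reported standard
deviations are sample standard deviations (n - 1 in the denominator),
which is the convention that matches the tabled figures.

```python
from statistics import mean, stdev

N_ITEMS = 34

# Judges' Angoff cut-offs (raw score) and expected fail rates, Table 2.
test1_cutoffs = [18.20, 24.40, 20.95, 20.70, 22.05, 21.80, 23.90]
test1_fail = [30, 35, 30, 20, 35, 67, 15]
test2_cutoffs = [20.85, 20.85, 22.95, 22.00, 20.95, 21.85, 21.10]
test2_fail = [30, 30, 28, 30, 30, 25, 30]


def summarize(cutoffs_raw, fail_rates, n_items=N_ITEMS):
    """Means, sample SDs (percent metric) and the Beuk orientation label."""
    cut_pct = [100.0 * c / n_items for c in cutoffs_raw]
    sd_cut, sd_fail = stdev(cut_pct), stdev(fail_rates)
    return {
        "mean_cut_pct": mean(cut_pct),
        "mean_fail": mean(fail_rates),
        "sd_cut_pct": sd_cut,
        "sd_fail": sd_fail,
        # Smaller spread on the absolute cut-off means "test oriented".
        "orientation": ("test oriented" if sd_cut < sd_fail
                        else "examinee oriented"),
    }


summary1 = summarize(test1_cutoffs, test1_fail)
summary2 = summarize(test2_cutoffs, test2_fail)
for label, s in (("Test 1", summary1), ("Test 2", summary2)):
    print(label, round(s["sd_cut_pct"], 2), round(s["sd_fail"], 2),
          s["orientation"])
```

By the rule given under the Beuk method, the smaller spread on the
relative dimension for Test 2 labels that panel examinee oriented.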

Tables 3 and 4 show the Beuk, DeGruijter, and Hofstee results for Test 1
and Test 2 respectively. The tables contain data only for the range of scores
within which the cut-off could conceivably be expected to lie. The columns in
the tables were derived as follows:

Cut-off score - each possible test score was used as a
                potential cut-off in the calculations

% Below       - the percentage of the examinee group below the
                raw score

Beuk Y        - the value Y in the equation Y = aX + b where a is
                the slope (ratio of the standard deviations
                multiplied by -1), b is the intercept and X is the
                raw score expressed as a percent

Beuk Diff     - the Beuk Y - % Below: the discrepancy between the
                observed failure rate and the Beuk value

DeGruijter d  - explained previously under the DeGruijter
                method

Hofstee Y     - the value Y in the equation Y = aX + b where a is
                the slope (determined from the two given points),
                b is the intercept and X is the raw score
                expressed as a percent. The line has end points as
                described in the Hofstee section of the paper.

Hofstee Diff  - the Hofstee Y - % Below: the discrepancy
                between the observed failure rate and the Hofstee
                value



The suggested cut-off can be determined by locating the number with the
smallest absolute value in the difference column for Beuk and Hofstee and in
the d column for DeGruijter. For both tests, the application of each method
resulted in a two point drop in the cut-off score. The initial cut-off, based
on the judges' Angoff estimates, was 22 items correct. The compromise methods
reduced the cut-off to 20.



These results are depicted graphically in Figures 1 and 2 for Test 1 and
Test 2, respectively. In Figure 1, the independence of the Hofstee line from
the results of the Angoff method shows clearly. Although the same number
correct score is obtained when the results are rounded, the Hofstee line does
not include the point defined by the two means. The DeGruijter ellipse shown
in Figure 1 is elongated along the vertical axis. As noted previously, this
represents the greater spread of estimates for the relative cut-off than for
the absolute cut-off.

Figure 2 shows a DeGruijter ellipse that is slightly elongated along the
horizontal dimension. The uncertainty ratio in this case was less than unity
since there was greater variance in the absolute than the relative cut-off.
Also, the difference in the standard deviations was less for Test 2 than for
Test 1, resulting in a more circular ellipse. This figure shows a very short
Hofstee line-segment. Judges were in close agreement as to the relative
cut-off. Their agreement was not, however, sensitive to the data. The result
was a short line-segment that did not cross the cumulative distribution.
Therefore, for these data the Hofstee method did not yield a cut-off score.

In order to better illustrate the DeGruijter method, three ellipses are
shown for Test 1 and Test 2 in Figures 3 and 4, respectively. The three
ellipses shown in each figure are the ellipse yielding the compromise
cut-off and the ellipses based on the data one score point above or one point
below that value. In Figure 3, the ellipse drawn using the data from the
compromise cut-off value actually touches the cumulative frequency
distribution at that point, while the other two ellipses are clearly larger
and intersect the distribution at two points. In Figure 4, it can be seen
that the ideal compromise would be a non-integer value since the compromise
ellipse actually intersects the cumulative distribution at two points.
However, the smallest ellipse drawn using integer values is used as the
compromise cut-off since the score scale is an integer scale.

These two figures demonstrate how the shape of the ellipses and the
resulting modification of the cut-off are affected by the magnitude of the
uncertainty ratio. In Figure 3, the ellipses are elongated in the vertical
axis. The uncertainty ratio for that test was 2.73. For this test, the
judges estimated the fail rate to be 33 percent and the cut-off score to be 65
percent (i.e. 22 items out of 34). The adjusted fail rate was 45 percent and
the adjusted cut-off score was 59 percent (i.e. 20 items out of 34). The
difference between the original and adjusted fail rates was 12 percent,
compared to a difference of 6 percent for the absolute cut-off scores,
consistent with an ellipse that has the vertical axis as its major axis.
Figure 4, on the other hand, was generated with an uncertainty ratio of 0.82
from Test 2 and indicates an elongation along the horizontal axis. As expected
with this type of ellipse, the difference between the ideal and compromise
cut-offs (65 and 59 percent respectively) is greater than the difference
between the ideal and adjusted fail rates (29 percent in both cases).
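
The link between the uncertainty ratio and the shape of the ellipse can be
made explicit with a short worked derivation. It is a sketch that assumes
Formula 1 has the form $d^2 = r^2 (c_o - c_i)^2 + (f_o - f_i)^2$, where the
subscript $i$ marks the judges' ideal values; that notation is an assumption
rather than the authors' typesetting. Setting each squared term to zero in
turn gives the two half-lengths, and the two reported ratios then fix which
axis is longer:

```latex
\begin{align*}
c_o = c_i &\;\Rightarrow\; |f_o - f_i| = d
  && \text{(vertical half-length)} \\
f_o = f_i &\;\Rightarrow\; |c_o - c_i| = d / r
  && \text{(horizontal half-length)} \\
r = 2.73 &\;\Rightarrow\; d / r \approx 0.37\,d < d
  && \text{(Test 1: vertical axis is the major axis)} \\
r = 0.82 &\;\Rightarrow\; d / r \approx 1.22\,d > d
  && \text{(Test 2: horizontal axis is the major axis)}
\end{align*}
```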

Discussion

An evaluation of the usefulness of the three compromise methods can be
made by considering the effect they have on the passing rate. For Test 1,
implementation of the judges' initial cut-off (a raw score of 22 or 65 percent
correct) would have provided a 33 percent passing rate. The compromise methods
raised that rate to 55 percent by lowering the cut-off to a raw score of 20
(59 percent correct). The 55 percent pass rate following the adjustment is
within twelve percentage points of the average desired passing percent (67
percent) for that test. One judge rated the expected passing rate for this
test as 33 percent. Although in practice one would not remove a judge simply
on the basis of a large discrepancy, it is instructive to note the effect of
that judge. Without the data for the sixth judge, the expected raw score
cut-off remains 22. The expected pass rate is raised from 67 to 72.5 percent.
The standard deviations (in terms of percentages) for the cut-off score and the
pass rate are 6.72 and 8.22 respectively. These data would result in a three
point drop in the cut-off and a final pass rate of 65 percent.
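
The effect of the sixth judge can be checked with a short sensitivity
sketch that drops one judge and reapplies the Beuk line. The helper name
is hypothetical; the sketch assumes sample standard deviations and the
percent-fail form of the Beuk line, and it takes the judges' values from
Table 2 and the percent-below values from Table 3.

```python
from statistics import mean, stdev

N_ITEMS = 34
cutoffs = [18.20, 24.40, 20.95, 20.70, 22.05, 21.80, 23.90]   # Table 2
fail_rates = [30, 35, 30, 20, 35, 67, 15]                     # Table 2
pct_below = {17: 21, 18: 28, 19: 35, 20: 45, 21: 57, 22: 67}  # Table 3


def beuk_without(judge_index):
    """Beuk cut-off, pass rate and SDs after omitting one judge (0-based)."""
    keep = [i for i in range(len(cutoffs)) if i != judge_index]
    cut_pct = [100.0 * cutoffs[i] / N_ITEMS for i in keep]
    fail = [fail_rates[i] for i in keep]
    sd_cut, sd_fail = stdev(cut_pct), stdev(fail)
    slope = -sd_fail / sd_cut

    def diff(raw):
        # Beuk line value minus the observed percent below this score.
        x = 100.0 * raw / N_ITEMS
        return mean(fail) + slope * (x - mean(cut_pct)) - pct_below[raw]

    best = min(pct_below, key=lambda raw: abs(diff(raw)))
    return best, 100 - pct_below[best], sd_cut, sd_fail


new_cut, new_pass, sd_cut, sd_fail = beuk_without(5)   # sixth judge
print(new_cut, new_pass, round(sd_cut, 2), round(sd_fail, 2))
```

The drop from 22 to 19 is the three point change described above; no
recommendation about removing judges is implied by the calculation.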

For Test 2, the desired passing percentage was 71. The initial cut-off
(a raw score of 22) resulted in a 50 percent pass rate. Following application
of the compromise methods, the passing rate at the resulting raw score cut-off
of 20 was 71 percent.

The results of this preliminary investigation are encouraging. Although
the methods provided the same results, this is probably due to the dependency
of the Hofstee and DeGruijter results on data collected for the Beuk method.
Nonetheless, the methods all provided compromise cuts resulting in passing
rates that were reasonably close to the rates specified by the judges.




By inspecting Tables 3 and 4, it can be seen that the portion of the
score scale containing the original and compromise cuts is an area in which
many examinees lie. Thus, the effect of a change in the cut-off of even a
single point on the overall pass rate is substantial. Had the cut-off been in
a different portion of the distribution, the results might not have been as
striking.

This study suggests that further research into all three of the
compromise methods is needed. A study which compares the methods when data
have been collected specifically for each method may better clarify
differences among the methods than this study, which was intended primarily to
demonstrate differences in the ways the data are treated. Additional studies
will also be required to investigate the sensitivity of the methods to various
combinations of score distributions and placement of the cut-off within those
distributions. As further research is conducted, important information about
the conceptual and practical attractiveness of the methods to client groups
will also become available. Consideration should be given to research
concerning more effective methods of acquiring passing rate data, including
what information should be provided to judges and how to establish uncertainty
estimates.



Table 1

Descriptive Information for the Two Tests

                          Test 1     Test 2

Number of Items              34         34
Number of Examinees         784        228
Mean                      19.81      21.32
Standard Deviation         4.20       4.11
Skewness                    .19        .89
KR-20                       .62        .60
SEM                        2.59       2.60
SEM (conditional)*         2.67       2.64

*Lord, F.M. (1984)



Table 2

Estimates of Absolute Cut-off Scores and Expected Pass Rates

                   Test 1                          Test 2

         Cut-off Score    Expected       Cut-off Score    Expected
Judge        (raw)        Fail Rate          (raw)        Fail Rate

  1          18.20           30              20.85           30
  2          24.40           35              20.85           30
  3          20.95           30              22.95           28
  4          20.70           20              22.00           30
  5          22.05           35              20.95           30
  6          21.80           67              21.85           25
  7          23.90           15              21.10           30

Mean     21.71 (63.87%)      33          21.51 (63.26%)      29
SD        2.08 ( 6.13%)   16.71           0.79 ( 2.34%)    1.91






Table 3

Results of the Application of the Beuk, DeGruijter, and Hofstee
Methods for Adjusting Cut-off Scores for Test 1

         Actual
Raw      Percent     Beuk Values     DeGruijter    Hofstee Values
Score    Below        Y      Diff        d           Y      Diff

 15        10         87      77         58
 16        14         79      65         49
 17        21         71      50         40
 18        28         63      35         30          69      41
 19        35         55      20         22          60      25
 20        45         47       2         18          52       7
 21        57         39     -18         25          44     -14
 22        67         31     -36         34          35     -32
 23        74         23     -52         42          27     -48
 24        81         15     -66         51          18     -63
 25        88          7     -81         60
 26        92         -1     -93         68
 27        95         -9    -104         75
 28        97        -17    -114         81
 29        98        -25    -124         88
 30        99        -33    -132         93










Table 4

Results of the Application of the Beuk, DeGruijter, and Hofstee
Methods for Adjusting Cut-off Scores for Test 2

         Actual
Raw      Percent     Beuk Values     DeGruijter    Hofstee Values
Score    Below        Y      Diff        d           Y      Diff

 15         6         45      39         28
 16        10         42      33         23
 17        13         40      27         19
 18        16         37      22         16
 19        21         35      14         10
 20        29         33       4          4
 21        40         30     -10         11          30     -11
 22        50         28     -22         21          27     -23
 23        61         25     -36         33          25     -37
 24        72         23     -49         43
 25        78         21     -57         49
 26        86         18     -67         58
 27        89         16     -74         62
 28        94         13     -81         67
 29        96         11     -86         70
 30        98          9     -89         72










References

Bejar, I.I. (1983). Subject matter experts' assessment of item
     statistics. Applied Psychological Measurement, 7, 303-310.

Beuk, C.H. (1984). A method for reaching a compromise between absolute
     and relative standards in examinations. Journal of Educational
     Measurement, 21, 147-152.

DeGruijter, D.N.M. (1985). Compromise models for establishing
     examination standards. Journal of Educational Measurement, 22,
     263-269.

Lord, F.M. (1984). Standard errors of measurement at different ability
     levels. Research Report 84-8. Princeton, NJ: Educational Testing
     Service.

Lorge, I. & Kruglov, L.K. (1953). The improvement of estimates of test
     development. Educational and Psychological Measurement, 13, 34-46.

Livingston, S.A. & Zieky, M.J. (1982). Passing Scores: A Manual for Setting
     Standards of Performance on Educational and Occupational Tests.
     Princeton, NJ: Educational Testing Service.

Thorndike, R.L. (1982). Item and score conversion by pooled judgment. In
     Holland, P.W. & Rubin, D.B. (eds.), Test Equating. New York: Academic
     Press, 309-317.






Figure 1: Compromise Results for Test 1
Beuk, Hofstee, & DeGruijter

Figure 2: Compromise Results for Test 2
Beuk, Hofstee, & DeGruijter

[Figures 1 and 2 not reproduced. Horizontal axis: Percent Correct.
Vertical axis: Percent of Examinees.]

Beuk = Short Dashed Line
DeGruijter = Solid Ellipse
Hofstee = Long Dashed Line
Cumulative % Below = Solid Ogive

Figure 3: DeGruijter Ellipses for Test 1
Raw Cuts 19, 20, & 21

Figure 4: DeGruijter Ellipses for Test 2
Raw Cuts 19, 20, & 21

[Figures 3 and 4 not reproduced. Horizontal axis: Percent Correct.
Vertical axis: Percent of Examinees.]

Score 19 = Short Dashed Ellipse
Score 20 = Solid Ellipse
Score 21 = Long Dashed Ellipse
Cumulative % Below = Solid Ogive



